This page contains plotly graphics.

library(tidyverse)
library(p8105.datasets)
library(plotly)

loading Instacart data

data("instacart")


instacart = instacart %>% 
  select(reordered,add_to_cart_order,order_hour_of_day,days_since_prior_order,product_name,order_dow,aisle,department) %>%
  mutate(
    order_hour_of_day = as.factor(order_hour_of_day),
    order_dow = 
           as.factor(recode(order_dow,'0' = "Sun",'1' = "Mon",'2' = "Tue",'3' = "Wed",'4' = "Thu",'5' = "Fri",'6' = "Sat")))

line graph

On which day of the week and on what time of the day are the orders placed the most?

instacart %>%
  group_by(order_dow) %>%
  count(order_hour_of_day) %>%
  mutate(
    text_label = str_c("Time:",order_hour_of_day,"hr","\nOrders:",n),
    order_dow = fct_relevel(order_dow,c("Sun","Mon","Tue","Wed","Thu","Fri","Sat"))) %>%
  plot_ly(
    x = ~order_hour_of_day, y = ~n, type = "scatter", color = ~order_dow, mode = "lines",text = ~text_label,alpha = 0.5) %>%
  layout(
    title = "Orders over 24 Hours")

From the graph, we can see that on Sunday, there will be much more orders placed than the other days of the week. Within one day, most oders are placed between 9 am to 18 pm.

boxplot

Among the products that have been ordered previously, what kinds of products are relatively easy to be consumed? Or what products do people tend to buy regularly without too much stocking-up?

instacart %>%
  filter(reordered == '1',
         department != "missing") %>%
  mutate(department = reorder(department,days_since_prior_order)) %>%
  plot_ly(
    x = ~department, y = ~days_since_prior_order, type = "box", color = "viridis", alpha = 0.5) %>%
  layout(
    title = "Days Since Prior Order between departments")
## Warning in RColorBrewer::brewer.pal(N, "Set2"): minimal value for n is 3, returning requested palette with 3 different levels

## Warning in RColorBrewer::brewer.pal(N, "Set2"): minimal value for n is 3, returning requested palette with 3 different levels

From the graph above, the products from the department of “babies” have a relatively short time interval from the prior order, which could indicate that babies products are the ones that people would like to buy regularly.

bar chart

Which department’s products are the most frequently to be first added into the cart?

instacart %>%
  filter(add_to_cart_order == '1') %>%
  count(department) %>%
  mutate(department = fct_reorder(department, n)) %>% 
  plot_ly(x = ~department, y = ~n, color = ~department, type = "bar", colors = "viridis") %>%
  layout(
    title = "Frequency of being first added to cart")

The bar chart shows that the products from the department of “produce” have the highest frequency of being added first. Inversely, the products of “bulk” tend to be the last addition to the cart.